stats: make the stat name of endpoint configurable#45044
Conversation
Signed-off-by: wbpcode/wangbaiping <wbphub@gmail.com>
|
CC @envoyproxy/api-shepherds: Your approval is needed for changes made to |
|
/retest |
| // This is useful when there are duplicate addresses in the cluster, for example when multiple | ||
| // endpoints share the same address but have different hostnames or metadata. | ||
| // In this case, the stat name can be used to differentiate between these endpoints in stats and logs. | ||
| string stat_name = 5; |
There was a problem hiding this comment.
The per-endpoint stats export two labels: the ip:port, and the hostname (if present). Which of these is this overriding? Or is it replacing both of those with a 3rd tag? Please document whatever the behavior is clearly.
| for (auto& host : host_set->hosts()) { | ||
| absl::string_view endpoint_observability_name = host->observabilityName(); | ||
| Network::Address::InstanceConstSharedPtr address; | ||
| if (endpoint_observability_name.empty()) { |
There was a problem hiding this comment.
observabilityName() returns address_->asStringView() as a fallback, so I think this condition is always false.
There was a problem hiding this comment.
There is a special case: LogicHost, whose address may be updated dynamically. That will not support the stat_name and also won't fallback to ip:port by default.
My plan is to remove logic host in the future because current default host implementation also could support multiple addresses.
Signed-off-by: wbpcode/wangbaiping <wbphub@gmail.com>
|
cc @ggreenway updated the comment to make it more clear. |
|
/retest |
Signed-off-by: code <wbphub@gmail.com>
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a stat_name field to the Endpoint configuration, allowing for custom observability names in per-endpoint stats and logs. The changes include updates to the HostDescription interface and HostUtility to incorporate this field. Feedback identifies a potential null pointer dereference in host_utility.cc when handling unresolved logical hosts, documentation typos in the proto file, and a requirement to update unit tests to account for stat name sanitization.
There was a problem hiding this comment.
Pull request overview
This PR adds a configurable “observability name” for endpoints, allowing per-endpoint stats to use Endpoint.stat_name instead of always deriving the name from ip:port, which prevents metric collisions when duplicate endpoint addresses exist (e.g., across priorities).
Changes:
- Add
stat_nametoenvoy.config.endpoint.v3.Endpointand plumb it into host construction. - Introduce
HostDescription::observabilityName()and use it in per-endpoint stats naming/tagging. - Add/adjust unit tests and mocks to cover the new endpoint stat naming behavior.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
api/envoy/config/endpoint/v3/endpoint_components.proto |
Adds Endpoint.stat_name field and associated API comments. |
envoy/upstream/host_description.h |
Extends HostDescription with observabilityName() API. |
source/common/upstream/upstream_impl.h |
Stores endpoint stat name on HostDescriptionImpl and exposes it via observabilityName(). |
source/common/upstream/upstream_impl.cc |
Wires Endpoint.stat_name into HostImpl/HostDescriptionImpl creation paths. |
source/common/upstream/host_utility.cc |
Uses host->observabilityName() for per-endpoint stat naming and envoy.endpoint_address tag value. |
source/common/upstream/health_discovery_service.cc |
Passes stat_name when constructing HDS hosts. |
source/extensions/clusters/common/logical_host.h |
Implements the new observabilityName() override for logical host types. |
test/mocks/upstream/host.h |
Updates host mocks to include/override observabilityName() and backing storage. |
test/mocks/upstream/host.cc |
Adds default ON_CALL behavior for observabilityName() in mocks. |
test/test_common/stats_utility.h |
Ensures test hosts have deterministic observability_name_ for per-endpoint metric tests. |
test/server/admin/stats_handler_speed_test.cc |
Updates speed-test mock host to implement observabilityName(). |
test/common/upstream/upstream_impl_test.cc |
Adds a unit test verifying Endpoint.stat_name is propagated to observabilityName(). |
test/common/upstream/host_utility_test.cc |
Adds a unit test verifying per-endpoint metric names/tags use the custom stat name. |
changelogs/current.yaml |
Adds release note entry for the new endpoint stat naming capability. |
Comments suppressed due to low confidence (1)
api/envoy/config/endpoint/v3/endpoint_components.proto:112
- This comment says stat_name can differentiate endpoints in “stats and logs”, but the current implementation appears to apply stat_name only to per-endpoint stats (no log sites use HostDescription::observabilityName yet). Consider narrowing this to stats/metrics to avoid overstating behavior.
// This is useful when there are duplicate addresses in the cluster, for example when multiple
// endpoints share the same address but have different hostnames or metadata.
// In this case, the stat name can be used to differentiate between these endpoints in stats and logs.
… dev-enhanc-host-stats
Signed-off-by: wbpcode/wangbaiping <wbphub@gmail.com>
Signed-off-by: wbpcode/wangbaiping <wbphub@gmail.com>
…ode/envoy into dev-enhanc-host-stats
|
/retest |
ggreenway
left a comment
There was a problem hiding this comment.
Mostly looks good, aside from minor docs issues
/wait
Co-authored-by: Greg Greenway <ggreenway@apple.com> Signed-off-by: code <wbphub@gmail.com>
Co-authored-by: Greg Greenway <ggreenway@apple.com> Signed-off-by: code <wbphub@gmail.com>
|
/retest |
|
/coverage |
|
Coverage for this Pull Request will be rendered here: https://storage.googleapis.com/envoy-cncf-pr/45044/coverage/index.html For comparison, current coverage on https://storage.googleapis.com/envoy-cncf-postsubmit/main/coverage/index.html The coverage results are (re-)rendered each time the CI |
|
(I'm not looking at coverage of this PR; needed a baseline for something else; ignore this) |
Commit Message: stats: make the stat name of endpoint configurable
Additional Description:
Make the stat name of endpoint configurable to close #44117. Before this PR, only the ip:port will be used as stat name in the per endpoint metrics. But if there are multiple endpoints with same ip:port, then the metrics will be corrupted.
(I initially want to add priority to per endpoint metrics as a tag to resolve this problem, but seems there is requirement to support duplicate endpoints in same priority, so I guess current way should be more flexible).
Risk Level: low.
Testing: unit.
Docs Changes: n/a.
Release Notes: added.
Platform Specific Features: n/a.